A Notation for Markov Decision Processes

Author

  • Philip S. Thomas
Abstract

Many reinforcement learning (RL) research papers contain paragraphs that define Markov decision processes (MDPs). These paragraphs take up space that could otherwise be used to present more useful content. In this paper we specify a notation for MDPs that can be used by other papers. Declaring the use of this notation with a single sentence can replace several paragraphs of notational specifications in other papers. Importantly, the notation that we define is a common foundation that appears in many RL papers; it is not meant to be a complete notation for an entire paper. We refer to our notation as the Markov Decision Process Notation, version 1, or MDPNv1. It can be invoked in research papers with a single sentence declaring its use.
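To make the object being standardized concrete, the following is a minimal sketch of a finite MDP written as the familiar tuple (S, A, p, R, d0, γ), together with value iteration. The symbol names and the toy transition/reward numbers are illustrative assumptions chosen for this example; they are not claimed to reproduce MDPNv1's exact definitions.

```python
# Illustrative finite MDP: states S, actions A, transition function p,
# expected reward R, and discount factor gamma. Symbols follow common
# MDP conventions, not necessarily MDPNv1's exact ones.

S = [0, 1]      # state set
A = [0, 1]      # action set
gamma = 0.9     # discount factor

# p[s][a][s2] = probability of transitioning to s2 from s under action a
p = {
    0: {0: {0: 0.8, 1: 0.2}, 1: {0: 0.1, 1: 0.9}},
    1: {0: {0: 0.5, 1: 0.5}, 1: {0: 0.0, 1: 1.0}},
}
# R[s][a] = expected immediate reward for taking action a in state s
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}}

def value_iteration(tol=1e-8):
    """Compute optimal state values via the Bellman optimality update."""
    V = {s: 0.0 for s in S}
    while True:
        V_new = {
            s: max(R[s][a] + gamma * sum(p[s][a][s2] * V[s2] for s2 in S)
                   for a in A)
            for s in S
        }
        if max(abs(V_new[s] - V[s]) for s in S) < tol:
            return V_new
        V = V_new

V = value_iteration()
```

For this toy instance, action 1 is optimal in both states; the fixed point works out to V(1) = 2 / (1 - 0.9) = 20 and V(0) = 17.2 / 0.91 ≈ 18.90, which the iteration converges to.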


Similar articles

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...


Integrating Processes, Cases, and Decisions for Knowledge-Intensive Process Modelling

Knowledge-intensive processes require flexibility and scalability in modelling, as well as deep integration of data and decisions into the process. Business Process Model and Notation (BPMN) is a pertinent modelling method for processes. Until recently, decisions were typically modelled as part of the process model itself, in intertwined paths and gateways, negatively affecting maintainability, ...


Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...


On $L_1$-weak ergodicity of nonhomogeneous continuous-time Markov processes

In the present paper we investigate the $L_1$-weak ergodicity of nonhomogeneous continuous-time Markov processes with general state spaces. We provide a necessary and sufficient condition for such processes to satisfy the $L_1$-weak ergodicity. Moreover, we apply the obtained results to establish $L_1$-weak ergodicity of quadratic stochastic processes.


A Unifying Perspective of Parametric Policy Search Methods for Markov Decision Processes

Parametric policy search algorithms are one of the methods of choice for the optimisation of Markov Decision Processes, with Expectation Maximisation and natural gradient ascent being popular methods in this field. In this article we provide a unifying perspective of these two algorithms by showing that their search directions in the parameter space are closely related to the search direction of...



Journal:
  • CoRR

Volume: abs/1512.09075  Issue:

Pages:  -

Publication date: 2015